Impact of frame rate on automatic speech-text alignment for corpus-based phonetic studies
نویسندگان
چکیده
Phonetic segmentation is the basis for many phonetic and linguistic studies. As manual segmentation is a lengthy and tedious task, automatic procedures have been developed over the years. They rely on acoustic Hidden Markov Models. Many studies have been conducted, and refinements developed for corpus based speech synthesis, where the technology is mainly used in a speaker-dependent context and applied on good quality speech signals. In a different research direction, automatic speech-text alignment is also used for phonetic and linguistic studies on large speech corpora. In this case, speaker independent acoustic models are mandatory, and the speech quality may not be so good. The speech models rely on 10 ms shift between acoustic frames, and their topology leads to strong minimum duration constraints. This paper focuses on the acoustic analysis frame rate, and gives a first insight on the impact of the frame rate on corpus-based phonetic studies.
منابع مشابه
Enhanced CORILGA: Introducing the Automatic Phonetic Alignment Tool for Continuous Speech
The Corpus Oral Informatizado da Lingua Galega (CORILGA) project aims at building a corpus of oral language for Galician, primarily designed to study the linguistic variation and change. This project is currently under development and it is periodically enriched with new contributions. The long-term goal is that all the speech recordings will be enriched with phonetic, syllabic, morphosyntactic...
متن کاملDIMEx100: A New Phonetic and Speech Corpus for Mexican Spanish
In this paper the phonetic and speech corpus DIMEx100 for Mexican Spanish is presented. We discuss both the linguistic motivation and the computational tools employed for the design, collection and transcription of the corpus. The phonetic transcription methodology is based on recent empirical studies proposing a new basic set of allophones and phonological rules for the dialect of the central ...
متن کاملData Driven Approaches to Phonetic Transcription with Integration of Automatic Speech Recognition and Grapheme-to-Phoneme for Spoken Buddhist Sutra
We propose a new approach for performing phonetic transcription of text that utilizes automatic speech recognition (ASR) to help traditional grapheme-to-phoneme (G2P) techniques. This approach was applied to transcribe Chinese text into Taiwanese phonetic symbols. By augmenting the text with speech and using automatic speech recognition with a sausage searching net constructed from multiple pro...
متن کاملAutomatic phoneme alignment based on acoustic-phonetic modeling
This paper presents a method for speaker-independent automatic phonetic alignment that is distinguished from standard HMM-based “forced alignment” in three respects: (1) specific acoustic-phonetic features are used, in addition to PLP features, by the phonetic classifier; (2) the units of classification consist of distinctive phonetic features instead of phonemes; and (3) observation probabilit...
متن کاملCoALT: A Software for Comparing Automatic Labelling Tools
Speech-text alignment tools are frequently used in speech technology and research. In this paper, we propose a GPL software CoALT (Comparing Automatic Labelling Tools) for comparing two automatic labellers or two speech-text alignment tools, ranking them and displaying statistics about their differences. The main feature of CoALT is that a user can define its own criteria for evaluating and com...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015